DS808: Visualization project

By Josephine Plass-Nielsen and Ulrikke Jørgensen

Domain situation

The National Centers for Environmental Information (NCEI) has the largest archive of registered earthquakes, along with features such as:

  • Country
  • Latitude
  • Longitude
  • Focal depth
  • Year
  • Magnitude
  • Total number of deaths
  • Total number of missing
  • Total number of injuries
  • Total number of houses damaged
  • Total number of houses destroyed

Target user

Our target users are ordinary (non-expert) people who want to know more about earthquakes.
For example, a user might want to see how earthquakes are distributed across the world, or to discover correlations between the features of earthquakes.

WHAT - Data abstraction

In [16]:
data 
Out[16]:
      Year  Country      Latitude  Longitude  Focal Depth (km)  Mag  Total Deaths  Total Missing  Total Injuries  Total Houses Destroyed  Total Houses Damaged
0     1935  USA            46.600   -112.000                 0  6.2             2              0               0                     300                     0
1     1935  CHINA          29.400    102.300                 0  6.0             2              0               0                       0                     0
2     1935  USA            46.600   -112.000                 0  6.0             2              0               0                       0                     0
3     1935  CHINA          28.700    103.600                 0  6.0           100              0               0                       0                     0
4     1935  TAIWAN         24.600    120.800                30  6.5          2746              0            6004                   30000                     0
...    ...  ...               ...        ...               ...  ...           ...            ...             ...                     ...                   ...
2503  2020  USA            40.751   -112.078                12  5.7             0              0               0                       0                     0
2504  2020  NEW ZEALAND   -33.294   -177.838                10  7.4             0              0               0                       0                     0
2505  2020  CHINA          33.124     98.916                10  5.3             0              0               0                       0                   790
2506  2020  INDONESIA      -6.808    106.676                23  5.0             0              0               4                       0                  1137
2507  2020  USA            55.030   -158.522                28  7.8             0              0               0                       0                     0

2508 rows × 11 columns
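The table mixes one categorical attribute (Country) with ten quantitative ones. The cells further down work on a frame called `data_sub`, whose definition is not shown in this notebook. The sketch below is one possible way to obtain such a subset, assuming it simply keeps the numeric columns. The names `sample` and `sample_sub` are hypothetical, and the rows are the first five rows of the output above.

```python
import pandas as pd

# First five rows copied from the output above (a small sample, not the full data set).
columns = [
    "Year", "Country", "Latitude", "Longitude", "Focal Depth (km)", "Mag",
    "Total Deaths", "Total Missing", "Total Injuries",
    "Total Houses Destroyed", "Total Houses Damaged",
]
rows = [
    [1935, "USA", 46.6, -112.0, 0, 6.2, 2, 0, 0, 300, 0],
    [1935, "CHINA", 29.4, 102.3, 0, 6.0, 2, 0, 0, 0, 0],
    [1935, "USA", 46.6, -112.0, 0, 6.0, 2, 0, 0, 0, 0],
    [1935, "CHINA", 28.7, 103.6, 0, 6.0, 100, 0, 0, 0, 0],
    [1935, "TAIWAN", 24.6, 120.8, 30, 6.5, 2746, 0, 6004, 30000, 0],
]
sample = pd.DataFrame(rows, columns=columns)

# Attribute types: Country is categorical (object), the rest are quantitative.
print(sample.dtypes)

# Assumption: the numeric subset is what the later cells call data_sub.
sample_sub = sample.select_dtypes(include="number")
print(list(sample_sub.columns))
```

Dropping Country matters for the correlation matrix, since correlations are only defined for the quantitative attributes.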

What conclusions can be derived from the data?

In [19]:
corrMatrix = data_sub.corr()
sn.heatmap(corrMatrix, annot=True)
Out[19]:
[Figure: annotated heatmap of the correlation matrix of data_sub]
In [20]:
axes = pd.plotting.scatter_matrix(data_sub, diagonal='hist', figsize=(14, 10))
for ax in axes.ravel():
    # Shrink and tilt the axis labels so that neighbouring labels do not overlap
    ax.set_xlabel(ax.get_xlabel(), fontsize=7.5, rotation=-15)
    ax.set_ylabel(ax.get_ylabel(), fontsize=8, rotation=83)
In [44]:
plt.scatter(data_sub['Year'],data_sub['Mag'])
plt.ylabel('Mag')
plt.xlabel('Year')
plt.show()
In [21]:
plt.plot(data_sub['Year'],data_sub['Total Deaths'])
plt.ylabel('Total Deaths')
plt.xlabel('Year')
plt.show()
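
The line plot above connects the rows in the order they appear, so a year with several earthquakes contributes several points at the same x position. One possible alternative is to aggregate the deaths per year before plotting. The sketch below uses a hypothetical miniature frame `events` (values taken from rows shown in the table above) in place of `data_sub`.

```python
import pandas as pd

# Hypothetical stand-in for data_sub, using a few rows from the table above.
events = pd.DataFrame({
    "Year": [1935, 1935, 1935, 2020, 2020],
    "Total Deaths": [2, 100, 2746, 0, 0],
})

# One value per year: the sum of deaths over all earthquakes in that year.
deaths_per_year = events.groupby("Year")["Total Deaths"].sum()
print(deaths_per_year)
```

The resulting series has one entry per year, so passing `deaths_per_year.index` and `deaths_per_year.values` to `plt.plot` would draw a single line over time.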

WHY - Task abstraction

Action and target:

  • Locate extremes
  • Summarize distribution

  • Why do I want to use visualization for my problem?

  • Does the visualization support the underlying task?
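
The two tasks listed above can also be expressed as simple queries, which is a useful check on what the visualization should make visible. This is a minimal sketch on a hypothetical frame `quakes` built from a few rows of the table above; it is not part of the original analysis.

```python
import pandas as pd

# Hypothetical sample built from rows shown in the data table.
quakes = pd.DataFrame({
    "Country": ["USA", "CHINA", "TAIWAN", "NEW ZEALAND", "USA"],
    "Year": [1935, 1935, 1935, 2020, 2020],
    "Mag": [6.2, 6.0, 6.5, 7.4, 7.8],
    "Total Deaths": [2, 100, 2746, 0, 0],
})

# Locate extremes: the strongest and the deadliest earthquake in the sample.
strongest = quakes.nlargest(1, "Mag")
deadliest = quakes.nlargest(1, "Total Deaths")
print(strongest)
print(deadliest)

# Summarize distribution: count, mean, quartiles, min and max of the magnitude.
mag_summary = quakes["Mag"].describe()
print(mag_summary)
```

In the glyph map, the same extremes should stand out as the largest and the darkest glyphs, without the user having to run a query.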

HOW - Visual encoding/interaction idiom

Which design decisions were made?

Mapping data features to visual features:

  • Earthquake = glyph (glyph map)
  • Location (latitude, longitude) = position on the world map
  • Number of deaths = color of glyph
  • Magnitude = size of glyph
  • Year = timeline

From magnitude to realistic size of glyph
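
The plot below reads the glyph size from a 'Bubble Size' column, but the notebook does not show how that column is computed. Earthquake magnitude is a logarithmic scale: each whole-number step corresponds to a tenfold increase in measured amplitude. The sketch below assumes the size is made proportional to that amplitude relative to magnitude 5. The helper `bubble_size` and its `reference_mag` parameter are hypothetical and may differ from the formula actually used.

```python
import pandas as pd


def bubble_size(mag, reference_mag=5.0):
    """Relative amplitude compared with a reference magnitude (hypothetical helper).

    Assumption: size grows by a factor of 10 per whole magnitude step.
    """
    return 10.0 ** (mag - reference_mag)


mags = pd.Series([5.0, 6.0, 7.0, 7.8])
sizes = bubble_size(mags)
print(sizes)
```

With this mapping a magnitude 7 earthquake gets a value 100 times that of a magnitude 5 earthquake. A linear mapping would make the strongest earthquakes look only slightly larger than the weakest ones.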

Filtering data:

  • From year 1935 and after
  • From magnitude 5 and above
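
The two filters above can be expressed as a boolean mask. This is a minimal sketch on a hypothetical frame `raw`; the 1900 row and the magnitude 4.5 row are made up to show what gets removed.

```python
import pandas as pd

# Hypothetical unfiltered data; the first and third rows are invented for illustration.
raw = pd.DataFrame({
    "Year": [1900, 1935, 1980, 2020],
    "Mag": [7.0, 6.2, 4.5, 7.8],
})

# Keep earthquakes from 1935 onwards with magnitude 5 or above.
filtered = raw[(raw["Year"] >= 1935) & (raw["Mag"] >= 5)]
print(filtered)
```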
In [3]:
fig = px.scatter_geo(
    data_test,
    lat="Latitude",
    lon="Longitude",
    color="Total Deaths",
    size="Bubble Size",
    size_max=60,
    hover_name="Country",
    hover_data={
        "Latitude": False,
        "Longitude": False,
        "Bubble Size": False,
        "Year": True,
        "Mag": True,
        "Total Deaths": True,
    },
    color_continuous_scale=["orange", "red", "brown", "black"],
    range_color=(0, 316000),
    opacity=1,
    animation_frame="Year",
    animation_group="Year",
    projection="hammer",
)
fig.show()